ABSTRACT


Big data refers to data sets so large that they are hard to handle with existing systems. It requires new structures, algorithms and techniques. As data grows in volume,
dark data also grows. The Artificial Bee Colony (ABC) algorithm is a part of swarm intelligence. It is based on how honey bees work to find their food sources.
Big data lives in a distributed environment, so the required sources may be in different places. During processing, these data sources have to be located in different
places and analyzed on one system. This requires a calculation that helps us find the best option among the candidate data sources. The ABC algorithm is used to overcome
the limitations of the ant colony algorithm. In ant colony optimization, initialization is repeated from the starting point in case of failure; in bee colony optimization,
initialization happens only once. ABC is used to find the required data source, based on parameters, out of multiple data sources. Thus, the artificial bee colony algorithm
can be used to find the best data sources. We can store these derived data sources on the cloud for further processing. The bee colony algorithm is generally used in the
data mining and networking fields. It can also be used in big data for identifying data resources.
1. INTRODUCTION

This study combines big data with artificial bee colony optimization.
Artificial bee colony optimization is used in networking and data mining. In the
networking field, it is used to find the best destination address via the
shortest path. Artificial bee colony optimization was developed in 2005, and its
main characteristic is finding the best data source among multiple given data
sources. It is a part of swarm intelligence and was developed to overcome the
limitations of ant colony optimization. It works on probabilities based on the
fitness of sources/destinations. [1] Fitness is decided based on different
parameters. The source satisfying the most parameters is the most useful and
sufficient source. The probability of a source is the ratio of its fitness to the
total fitness of all sources within a given system. The source with the highest
probability is chosen for processing first.
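
The probability rule above can be illustrated with a short Python sketch. The source names and fitness scores below are hypothetical (here fitness is assumed to be the number of parameters a source satisfies); only the ratio itself comes from the text.

```python
def selection_probabilities(fitness):
    """Return p_i = fitness_i / sum of all fitness values, per source."""
    total = sum(fitness.values())
    if total <= 0:
        raise ValueError("total fitness must be positive")
    return {name: value / total for name, value in fitness.items()}


# Hypothetical fitness scores: number of parameters each source satisfies.
fitness = {"source_a": 3, "source_b": 5, "source_c": 2}

probabilities = selection_probabilities(fitness)
# The source with the highest probability is processed first.
best = max(probabilities, key=probabilities.get)
print(probabilities, best)
```

With these assumed scores the total fitness is 10, so source_b has probability 5/10 and is chosen first.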




In Artificial Bee Colony (ABC) optimization, the behavior of honey bees is taken
as a reference. There are four main types of bees. [1]


• Queen (it always lives in the hive)
• Onlooker (observes the work of workers and scouts)
• Worker (finds food sources initially)
• Scout (once the food source of a worker is exhausted, the worker becomes a
  scout and tries to find another source)


The bees find a food source as follows. [1]

• Worker: worker bees travel around and try to find a food source. Once they
  find a food source, the bees inform each other by performing a dance called
  the “waggle dance.”

• Scout: once the food source of a bee is exhausted by other bees, that bee
  becomes a scout and goes to find a new source. Thus a scout is simply an
  experienced worker bee.





Fig. 1.1 The behaviour of honey bee foraging for nectar [8] 


• Onlooker: the onlooker observes all these activities of scouts and workers by
  staying near the hive. Initially, the routes for all bees to travel are also
  decided by the onlooker. Workers give all information about a food source to
  the onlooker, so the onlooker sends all the remaining bees in that direction
  only.


2. LITERATURE REVIEW 

Big data contains a very large amount of data, and the clusters or classes
created from it may also be very large. If we want to apply the ABC algorithm in
such a situation, we should create an environment in which the optimization
process can work on large data sets. For this purpose we can choose to work on a
cloud platform or on other platforms such as Hadoop and Spark. Generally, HDFS
and Spark can handle big data, but if we want to perform ABC optimization on a
data set temporarily, we can use cloud storage offered as SaaS (software as a
service), PaaS (platform as a service) or IaaS (infrastructure as a service). [3]
In this way we can store clusters on the cloud with some predefined fitness and
perform operations to find the best one.


ABC is well suited for the general assignment problem, cluster analysis,
constrained problem optimization, structural optimization, and advisory systems.
It has also been applied in software engineering for software testing and for
parameter estimation in software reliability growth models. Thus ABC is an
efficient and applicable algorithm for emerging technologies. [1]


Cloud computing is useful for running ABC optimization. In such cases we can
load data sets through VMs. ABC helps cloud computing with load balancing: it
can identify only the required best data source and load it in the cloud to
reduce load. Whenever we want to run a larger number of processes, ABC can also
help with VM migration. Memory utilization and processing speed can be managed
and improved by using ABC optimization. [2]


Big data consists of a large quantity of data, and it is still increasing day by
day. Operations performed on it require high processing capacity, a high level
of memory utilization and resource migration. Here cloud computing is like a ray
of hope. Cloud storage provides facilities to store and process data virtually
without any load on the main processing system. The data we want to analyze with
ABC optimization can also be put in a cloud environment temporarily. [3] In fact,
big data and cloud computing are growing in parallel today, and they may become
very important supports for each other.


ABC is also used in wireless communication, where it is suitable for wireless
sensor networks. Here, nodes in the same communication station are divided into
clusters. Each cluster has its own worker bees (communicating nodes) and an
onlooker (cluster head). A cluster head can act as an interpreter with cluster
heads from other clusters for communication purposes. The nodes can act in the
same way as honey bees to find their destination nodes. [4]


Ant colony optimization is useful for smaller distances and smaller data sets
only. First, the ants are initialized. Then some of the ants travel through
multiple paths as workers to find sources, leaving pheromones behind them, and
other ants can follow them by sensing the pheromones left by previous ants. Thus
all ants can reach the required food source. [5]


3. RELATED WORKS 

Ant colony optimization is a part of swarm intelligence, and it is generally
used in wireless communication systems and data mining where the distance is
short and the data is small. It is based on how ants work to find their food
sources. But ant colony optimization has some limitations. If one ant finds a
bad source, all other ants will be misguided by it. On the other hand, if the
source found is not sufficient, the ants must be re-initialized. This makes the
process more time consuming, so it can be used only for short distances or small
amounts of data. [5]


Initialize pheromones and system parameters
Generate m ants and place them on the graph randomly
For each ant, construct a subset of features using the transition rule
Update the global best subset and the local best subset for any ant








Figure 1. Flow chart of ant colony optimization 


4. PERFORMANCE EVALUATION 

Bee colony optimization is a part of swarm intelligence. Before ABC was
introduced, ant colony optimization was a popular way to find the best source in
a given environment. Ant colony optimization works based on the behavior of ants
when finding sources. [5] The characteristics of ant colony optimization are:

• Suitable for smaller data sizes
• Slow to deliver results
• Less accurate in the case of multiple data sources
• Re-initialization is required if the delivered data source is not sufficient


Pseudo code example for ant colony optimization [11]

procedure ACO-Metaheuristic
    while (not termination)
        generateSolutions()
        daemonActions()
        pheromoneUpdate()
    end while
end procedure

In ant colony optimization, some of the ants initially start travelling along
different edges to find sources. Each ant deposits pheromones so that the
remaining ants can follow its path. If a path stays idle for some time without
any ant travelling along it, its pheromones evaporate. Thus, the path with the
strongest pheromones is declared the best path to the destination. But in this
case the destination is decided only by how the first worker ant found a source.
An ant that finds a source is unaware of the probabilities of the other sources.
Thus, once a wrong source is found, all the remaining ants are misguided to it
without exploring other options. This is the main disadvantage of ant colony
optimization. As a result, ant colony optimization is less accurate when there
is a larger number of data sources.
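
The evaporation and reinforcement behaviour described above can be sketched in Python. This is only an illustration under stated assumptions: the update rule (multiply every edge by 1 - evaporation_rate, then add deposit / path length on the edges an ant used) is the commonly used Ant System form, and the graph, the parameter values and the function name update_pheromones are hypothetical, not taken from the paper.

```python
def update_pheromones(pheromone, ant_paths, evaporation_rate=0.5, deposit=1.0):
    """Evaporate pheromone on every edge, then reinforce edges the ants used.

    pheromone: dict mapping (node, node) edges to pheromone levels.
    ant_paths: list of (path, length) pairs, one per ant.
    """
    # Evaporation: every edge loses a fixed fraction of its pheromone.
    updated = {edge: (1.0 - evaporation_rate) * level
               for edge, level in pheromone.items()}
    # Deposit: shorter paths receive more pheromone per edge.
    for path, length in ant_paths:
        for edge in zip(path, path[1:]):
            updated[edge] = updated.get(edge, 0.0) + deposit / length
    return updated


# Hypothetical graph with two routes from the nest to the food source.
pheromone = {
    ("nest", "a"): 1.0, ("a", "food"): 1.0,
    ("nest", "b"): 1.0, ("b", "food"): 1.0,
}
# A single ant found the route through "a" (assumed length 2.0).
ant_paths = [(["nest", "a", "food"], 2.0)]

pheromone = update_pheromones(pheromone, ant_paths)
print(pheromone)
```

After one update the route through "a" keeps its pheromone level while the unused route through "b" decays, which is how the first discovered source comes to attract the remaining ants.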


To overcome the limitations of ant colony optimization, the artificial bee
colony algorithm was introduced. It is based on the way honey bees find sources.
The employed bees share the information about their food sources with the
onlooker bees after all of them complete the search process. An onlooker bee
evaluates the nectar information taken from all employed bees and chooses a food
source with a probability related to its nectar amount, using an equation known
as the roulette wheel selection method, which gives better candidates a greater
chance of being selected. [4]
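
Roulette wheel selection can be sketched in Python as follows. The nectar amounts, the seed and the function name roulette_wheel_select are hypothetical; the sketch assumes all nectar amounts are positive.

```python
import random


def roulette_wheel_select(nectar, rng):
    """Return the index of a food source, chosen with probability
    proportional to its nectar amount."""
    total = sum(nectar)
    pick = rng.uniform(0.0, total)   # spin the wheel
    cumulative = 0.0
    for index, amount in enumerate(nectar):
        cumulative += amount
        if pick <= cumulative:
            return index
    return len(nectar) - 1           # guard against floating-point round-off


rng = random.Random(0)               # fixed seed so the run is repeatable
nectar = [1.0, 3.0, 6.0]             # hypothetical nectar amounts
counts = [0, 0, 0]
for _ in range(10_000):
    counts[roulette_wheel_select(nectar, rng)] += 1
print(counts)
```

Over many draws the richest source is picked most often, yet poorer sources still receive some onlookers, so the search does not collapse onto a single source.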

Characteristics of artificial bee colony optimization are:

• Suitable for larger data sizes
• Fast to deliver results compared to ant colony optimization
• More accurate, even in the case of multiple data sources
• Re-initialization isn't required if the delivered data source is not sufficient
• It can easily accept a new data source

To choose a data source based on fitness, the data sources should be clustered
with respect to some parameters. The parameters used to decide the fitness of a
data source differ from one data source to another, so different parameters must
be chosen for different cases. The purpose of finding the data source helps to
define the parameters.


ALGORITHM [4]

1. Generate initial population xi, i = 1...SN
2. Evaluate the population
3. Set cycle to 1
4. Repeat
5. FOR each employed bee
6. Produce new solutions vi
7. Calculate the fitness
8. Apply the greedy selection process
9. FOR each onlooker bee
10. Choose a solution xi depending on pi
11. Produce new solutions vi
12. Calculate the fitness
13. Apply the greedy selection process
14. If there is an abandoned solution then
15. Replace it with a new solution produced by a scout
16. Memorize the best solution achieved so far
17. cycle = cycle + 1
18. Until cycle = MCN
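
The algorithm above can be sketched as a small Python program. This is a minimal illustration, not the authors' implementation. The listing does not reproduce the update equations, so the sketch assumes the forms commonly used with ABC: a neighbour v = x + phi * (x - x_k) on one random dimension with phi in [-1, 1], fitness 1 / (1 + cost) for non-negative cost, and a random restart for abandoned sources. The objective sphere, the abandonment threshold limit and all parameter values are hypothetical.

```python
import random


def sphere(x):
    """Hypothetical objective to minimize (not from the paper)."""
    return sum(v * v for v in x)


def fitness_of(cost):
    # Assumed fitness mapping for minimization problems.
    return 1.0 / (1.0 + cost) if cost >= 0 else 1.0 + abs(cost)


def abc_minimize(objective, dim=2, lower=-5.0, upper=5.0,
                 sn=10, limit=20, mcn=200, seed=0):
    rng = random.Random(seed)

    def new_source():
        return [rng.uniform(lower, upper) for _ in range(dim)]

    sources = [new_source() for _ in range(sn)]   # initial population
    costs = [objective(s) for s in sources]       # evaluate the population
    trials = [0] * sn                             # cycles without improvement
    best_cost = min(costs)
    best = list(sources[costs.index(best_cost)])

    def try_neighbor(i):
        # Produce a new solution near source i and apply greedy selection.
        k = rng.choice([n for n in range(sn) if n != i])
        j = rng.randrange(dim)
        candidate = list(sources[i])
        phi = rng.uniform(-1.0, 1.0)
        candidate[j] += phi * (sources[i][j] - sources[k][j])
        candidate[j] = min(max(candidate[j], lower), upper)
        cost = objective(candidate)
        if cost < costs[i]:
            sources[i], costs[i], trials[i] = candidate, cost, 0
        else:
            trials[i] += 1

    for _ in range(mcn):                          # repeat until cycle = MCN
        for i in range(sn):                       # employed bee phase
            try_neighbor(i)
        weights = [fitness_of(c) for c in costs]
        for _ in range(sn):                       # onlooker bee phase
            i = rng.choices(range(sn), weights=weights)[0]
            try_neighbor(i)
        worst = max(range(sn), key=lambda n: trials[n])
        if trials[worst] > limit:                 # scout replaces abandoned source
            sources[worst] = new_source()
            costs[worst] = objective(sources[worst])
            trials[worst] = 0
        cycle_best = min(costs)                   # memorize the best solution
        if cycle_best < best_cost:
            best_cost = cycle_best
            best = list(sources[costs.index(cycle_best)])
    return best, best_cost


best, best_cost = abc_minimize(sphere)
print(best, best_cost)
```

Because the population is generated once and only abandoned sources are replaced by scouts, the search never restarts from scratch, which matches the claim that re-initialization is not required.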
Figure 2. Flowchart of artificial bee colony optimization


5. CONCLUSION

ABC optimization can help us find the required data source among a large number
of data sources. ABC can also help arrange the output of big data in a
particular pattern. If clusters, analyzed outputs of data sets, or selected data
sets are in cloud storage or at different locations, ABC optimization is an easy
way to determine them. The Spark platform is useful for handling large data sets
in the form of clusters. ABC is good at optimization. ABC optimization behaves
differently in different cases because the fitness depends on the parameters
chosen in each case.


ACKNOWLEDGEMENTS

This research was supported by the Basic Science Research Program through the
National Research Foundation of Korea (NRF) funded by the Ministry of Education,
Science and Technology (2012R1A1A2044694).


REFERENCES

Research Papers:

[1] “Applications of Artificial Bee Colony Optimization Technique: Survey”, Kuldeep
Singh Kaswan, Sunita Chaudhari, Kapil Sharma, 978-9-3805-4416-8/15/$31.00 c
2015 IEEE

[2] “Enhanced Bee Colony Algorithm for Efficient Load Balancing and Scheduling in
Cloud”, K.R. Remesh Babu and Philip Samuel, Springer International Publishing
Switzerland 2016

[3] “The rise of “big data” on cloud computing: Review and open research issues”,
Ibrahim Abaker Targio Hashem, Ibrar Yaqoob, Nor Badrul Anuar, Salimah Mokhtar,
Abdullah Gani, Samee Ullah Khan, Information Systems 47 (2015) 98-115

[4] “Cluster based wireless sensor network routing using artificial bee colony
algorithm”, Dervis Karaboga, Selcuk Okdem, Celal Ozturk, Springer
Science+Business Media, LLC 2012

[5] “Optimization of the Running Speed of Ant Colony Algorithm with Address-based
Hardware Method”, Elnaz Shafigh Fard, Khalil Monfaredi, ISSN: 2180 - 1843 Vol. 7
No. 1 January - June 2015

[6] “Distributed Virtualization Manager for KVM Based Cluster”, Mr. Uchit Gandhi,
Mr. Mitul Modi, Ms. Mitali Raval, Mr. Paavan Maniar, Dr. Narendra Patel, Prof.
Kirti Sharma, Procedia Computer Science 79 (2016) 182-189, ScienceDirect

[7] “Data Model for Big Data in Cloud Environment”, Imran Khan, S. K. Naqvi, Mansaf
Alam, S.N.A Rizvi

[8] “Research of Resource Allocation in Cloud Computing Based on Improved Dual Bee
Colony Algorithm”, Wu Ju-Hua, International Journal of Grid Distribution
Computing, Vol. 8, No. 5, (2015), pp. 117-126

[9] “SAACO: A Self Adaptive Ant Colony Optimization in Cloud Computing”, Weifeng
Sun, Zhenxing Ji, Jianli Sun, Ning Zhang, Yan Hu, 2015 IEEE Fifth International
Conference on Big Data and Cloud Computing

[10] “Unsupervised probabilistic feature selection using ant colony optimization”,
Behrouz Zamani Dadaneh, Hossein Yeganeh Markid, Ali Zakerolhosseini, Expert
Systems With Applications 53 (2016)

Websites:

[11] https://en.wikipedia.org/wiki/Artificial_bee_colony_algorithm
[12] https://en.wikipedia.org/wiki/Swarm_intelligence
[13] http://mf.erciyes.edu.tr/abc/index.htm
[14] https://en.wikipedia.org/wiki/Ant_colony_optimization_algorithms


International Education & Research Journal [IERJ] 360 


